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In this paper, a novel computational technique for finite discrete approximation of continuous dy- 
namical systems suitable for a significant class of biochemical dynamical systems is introduced. The 
method is parameterized in order to affect the imposed level of approximation provided that with 
increasing parameter value the approximation converges to the original continuous system. By em- 
ploying this approximation technique, we present algorithms solving the reachability problem for 
biochemical dynamical systems. The presented method and algorithms are evaluated on several ex- 
emplary biological models and on a real case study. 

1 Introduction 

Under the modern holistic paradigm provided by systems biology f5l, genome-scale knowledge of in- 
dividual components is combined with knowledge of interactions underlying the physiology of living 
organisms. The central goal of systems biology is to integrate all available biological data and to re- 
construct executable models ll20l which allow to investigate complicated behaviour emerging from the 
underlying biochemistry. An important dimension is the quantitative aspect of the data and processes 
being modeled. 

With respect to [19], we consider biological models to be captured by the notion of a biochemical 
dynamical system consisting of variables describing a certain quantity of the respective species in time 
(e.g., number of molecules or molar concentration). Variable values evolve in time with respect to rules 
modeling the effect of reactions. The space of all possible configurations of variable values is referred as 
the state space. 

There exist several modeling approaches that differ in abstraction employed for modeling of time, 
variable values, and molecular interaction effects. The most commonly used approach concerns systems 
of ordinary differential equations (ODE) |28 1 where both time and model variables are interpreted as 
continuous quantities. Effects of interactions are modeled in terms of continuous deterministic updates 
of variables. Variable values represent molar concentrations of the species. In general, the ODE ap- 
proach relies on many physical and chemical assumptions simplifying thermodynamic conditions under 
which particular biochemical phenomena can be modeled correctly |25 1. It is important to note that even 
simple interactions such as second order reactions lead to non-linear ODEs. However, under certain as- 
sumptions, biological systems make specific subclasses of general non-linear dynamical systems. Such 
a specialization motivated development of specific analysis techniques |[T9ll7ll4l[24l. 
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Nevertheless, dimensionality and complexity of biological models preclude satisfactory application 
of analysis methods implying that to explore the model dynamics the only practicable method is numeri- 
cal simulation. Since numerical simulation generates an approximate solution (a trajectory) starting from 
a single initial point in the continuous state space, the scope of such exploration is limited to the partic- 
ular trajectory only. This is sufficient for "local" analysis provided that initial conditions are precisely 
known. However, studied systems are typically under-determined in terms of uncertain quantitative pa- 
rameters and initial conditions. Therefore generalization of the exploration scope is necessary to reveal 
and understand the complicated emergent behaviour. An important example of a problem which cannot 
be effectively solved by local methods is global temporal property - the problem to decide whether a 
given dynamical phenomenon, e.g., oscillation or variables correlation, is globally present/absent for all 
considered initial conditions |[T7ll9ll. 

In this paper we limit ourselves to a subclass of dynamical phenomena representing reachability of 
a given portion of the state space. Example of a global temporal property problem that belongs to this 
subclass is to identify minimal or maximal concentration of species reachable from a particular set of 
initial conditions. 

In general, the reachability problem is undecidable due to unboundedness and uncountability of the 
state space. However, since concentrations of species cannot expand infinitely, state spaces of biologi- 
cal systems dynamics can be considered bounded in most cases. Analysis can be therefore considered 
indirectly on suitable finite discrete approximations of continuous state spaces ll24l[3l. 

For a significant class of biochemical dynamical systems determined by multi-affine vector fields 
(i.e., affine in each variable), there has been developed an over- approximative abstraction technique 
based on partitioning the continuous state space by a finite set of rectangles. Rectangles determine states 
of a rectangular transition system representing the finite discrete (over)approximation of the continuous 
state space |[T4|, as shown in Figure [T^. The rectangular abstraction has been employed in ll24ll for 
reachability analysis and further elaborated by model checking methods in [6|. The results show that the 
extent of spurious behaviour introduced by the abstraction is typically very high thus limiting satisfactory 
application of the method. The problem is based mainly on the fact that a transition between any two 
individual rectangles over-approximates the vector field on the border between the rectangles (a so- 
called facet, see Figure [TJ3) provided that the information regarding which trajectories starting in an 
entry facet evolve through a particular exit facet is abstracted out. This causes the rectangular transition 
system to generate many rectangle sequences in which there is no corresponding trajectory of the original 
continuous system embedded. Moreover, the extent of such spurious behaviour is not directly eliminated 
by increasing the partition density. 

When analysing approximate models as in systems biology, the need for precise results critically 
required in systems verification can be relaxed provided that a suitable approximation of the solution can 
be even more efficient to obtain useful results. Henceforth, in the field of complex systems exhaustive 
techniques are often combined with approximative methods thus making a certain shift in the way of 
applying formal methods lIlQlfTTlfTOl . 

1.1 Our Contribution 

We present a new technique for discrete approximation of biochemical systems with dynamics given by a 
system of ODEs with multi-affine right hand side. Our discrete approximation is not an exact abstraction 
wrt the original continuous system, but rather an approximation that approaches exact reachability with 
decreasing approximation granularity. While still assuming the rectangular partition at the background, 
we employ a measure that enables local quantification of the amount of trajectories evolving on a rectan- 
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gle in a particular facet-to-facet direction. To this end, every rectangle is augmented with a local memory 
representing the information at which part (entry set) of the entry facet it has been entered. On each 
entry set, we identify focal subsets from which all trajectories lead to the same exit facet. In Figure [T]:, 
there are two different states of a quantitative discrete approximation automaton (QDAA) depicted. Both 
states share the same rectangle [1, 1.5] x [1, 1.5] and they differ in entry sets (marked yellow). The upper 
state with entry set {1.5} x [1, 1.5] has only one focal set - all trajectories from its entry set exit the state 
through the facet [1, 1.5] x {1}. The second state with entry set [1, 1.5] x {1.5} has two focal subsets 
made by the green and the red part of the entry facet, respectively. 

Transitions from a state with given entry set have weights assigned to themselves. Consider a tran- 
sition from a state A to a state B. The transition exists if there is a part P of the entry set of A such that 
the trajectories of ODE solutions go from P to B. Weight of a transition from A io B corresponds to 
the (n — 1 -dimensional) volume of P divided by the volume of the entry set of A. In this manner, the 
measure reflects amounts of trajectories proceeding in a particular direction. Rectangle regions related 



by weighted transitions make the QDAA which is a discrete-time Markov chain. (See Theorem [X2| and 
its proof in the full version of this paper available at I16J.) 

From a computational viewpoint, the continuous volumes are finitely approximated by discretization 
on a uniform grid. Local numerical simulations are employed to identify the entry regions and focal 
subsets. The density of facet discretization grid is considered as the method parameter. Because of com- 
bining numerical simulation with rectangular abstraction, the resulting QDAA makes neither an over- 
nor an under-approximation of the original continuous system. Since for every sequence of states the ap- 
proximate volume measure converges to the continuous volume with increasing discretization parameter, 
the parameter indirectly affects the correspondence between the original continuous behaviour and its ap- 
proximation. This makes the method sufficient for approximating reachability in complex biochemical 
dynamical systems. 

In general, the following main contributions are brought by this paper. 

1. A novel computational technique for finite discrete approximation of multi-affine dynamical sys- 
tems by means of QDAA. 



2. Showing that QDAA converges to the original continuous system behaviour. (See Theorem [33 
and its proof in the full version of this paper |[T6l .) 

3. A reachability algorithm for QDAA. 

4. Evaluation on elementary models and an E. Coli case study. 

Since the most common application of the considered systems class is the domain of biochemical 
dynamical systems modeled directly by rules of mass action kinetics [23. L evaluation of the method and 
algorithms is realized on biological models fitting this framework. 



1.2 Related Work 

Discrete approximation methods are commonly used in continuous and hybrid systems analysis (see ^ 
for an overview regarding reachability) to handle the uncountability of the state space. Direct methods 
work on the original system and rely on a successor operation iteratively computing the reachable set 
whereas indirect methods abstract from the continuous model by a finite structure for which the anal- 
ysis is simpler. Our method belongs to the latter class, since it uses numerical simulations and creates 
the abstraction automaton. Considering a fixed set of initial conditions, there is a certain overhead with 
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Figure 1: (a) Vector field of a linear system partitioned by thresholds, (b) the principle of rectangular 
abstraction, (c) and quantification of the extent of over-approximation in terms of transition weights. 
The dashed line inside the rectangle demonstrates the approximate border separating trajectories exiting 
through different facets. 



generating states of the automaton in comparison with simple numerical simulations. However, the ad- 
vantage of constructing the automaton is obtaining a global view of the dynamics. Moreover, in addition 
to rectangular abstraction, the automaton is augmented with weighted transitions which represent quan- 
titative information describing volumes of subsets of initial conditions belonging to attraction basins of 
different parts of the phase space. 

An indirect method based on rectangular abstraction automaton making the finite quotient of the 
continuous state space has been employed, e.g., in EH [DEI- In general, these methods rely on re- 
sults ||T4l EB and are applicable to (piece-wise) affine or (piece- wise) multi-affine systems. Although 
not addressed formally in this paper, our technique can be considered as a refinement of |[24ll . However, 
we focus on obtaining satisfactory approximate results eliminating the extent of spurious behaviour com- 
ing from conservativeness of rectangular abstraction. Our technique can be employed for the recognition 
of spurious behaviour of the rectangular abstraction transition system. 

The technique presented in Il27ll employes timed automata for the finite quotient of a continuous 
system as an alternative to piece-wise linear approximations. Another indirect technique adapted to 
multi-affine biological models is ifTSll . The approach also employes rectangular abstraction, but results 
in less conservative reachable sets by means of polyhedral operations. In |[2l[8l there are techniques pro- 
posed for rectangular refinement that go towards reduction of over-conservativeness. These techniques 
work fine for linear systems while leaving the non-linear systems as a challenge. 

Direct methods are mostly based on hybridization realized by partitioning the system state space into 
domains where the local continuous behaviour is linearized |[T2ll . This method, in an improved form, has 
been applied to non-linear biochemical dynamical systems |[T8ll . In general, direct methods give good 
results for low-dimensional systems and small initial sets. In comparison with indirect approaches, they 
are computationally harder. From this viewpoint, our approach lies between both extremes. 



2 Preliminaries 

2.1 Basic definitions and facts 

Let N denote the set of positive integers. No the set N U {0}, and Mq the set of nonnegative real numbers. 
For n G N, denote W the standard fz-dimensional Euclidean space with standard topology and Euclidean 
norm | • | : ^ Rq . For an arbitrary function / we use the common notation dom{f) for the domain of /. 
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For every / G {1, . . . assume ai^bi G M such that at < bi. Denote / HLi i^ii^i] n-dimensional 
closed interval in W- and vol{I) the n-dimensional volume of I defined as vol{I) — IlLi (^^ ~ ^i)- Further 
denote Inter {I) the interior of I, defined as the cartesian product of open intervals HLi (^n ^0- 

For any X C denote the Lebesgue outer measure (on W) of the set X. Basically 

is the minimal nonnegative real number such that whenever X can be covered by a sequence of closed 
intervals in W the sum of volumes of these intervals is greater then or equal to (For precise 

definitions see |30|.) Note that A*(X) < oo for every bounded set X and A*(/) = vol{I) for every n- 
dimensional interval /. 

Let n > 2, / < c G M. We use M^"^ (c) to denote the hyper-plane R^"^ (c) = {(xi , . . . G R"" | x,- = 
c}. Denote fti : R^ the projection omitting the /th variable, ^/((xi, . . . ,x„)) = (xi, . . . 

...Xn). Let X C R^~^ (c). We extend the notion of the {n — 1) -dimensional Lebesgue outer measure to 
such sets X and denote {X) the {n — 1) -dimensional Lebesgue outer measure of 7ti{X). 

Let / : R" ^ R^ be a continuous function (an autonomous vector field). We say that 

x = /(x) (1) 

is an autonomous ODE system. An important property of autonomous systems is the fact that if y{t) is a 
solution of ([T]) on an open interval (a, then };(/ + /^o) is also a solution (defined on interval {a — t^^b — 

A function / : R" ^ R^ satisfies the Lipschitz condition locally on R^, if for every x G R" there exists 
an open set (7 C R", x G (7 and a constant L G R such that for every two points xi ,X2 G the inequality 
|/(xi) -/(x2)| <L - |xi -X2I holds. 

Theorem 2.1 (Trajectories of solutions of an autonomous system) Let Q be an autonomous system, 
where f is defined on R^ and let f satisfy the Lipschitz condition locally on R^. Let x be an inextendible 
solution of system ([7|. Then dom{x) is an open interval, and for every point a G R" there exists exactly 
one trajectory of an inextendible solution x{t) of system ([7| coming through a. 

Theorem 2.2 (Continuous dependency on initial conditions) Let f \ W be continuous on an 

open set £" C R^ with the property that for every ^ E, the initial value problem x — f{x) , x(0) — has 
a unique solution y{t) = ri{t,yo) (Vj is a function of variables t,yo ). Let w±,wj- G R such that {w±,wj) 
is the maximal interval of existence ofy{t) — Tj^t^yo). 

Then the bounds w±,wj are (lower, resp. upper semicontinuous) functions ofyo in E and Ti(t^yo) is 
continuous on the set {{t,yo) \ yo G E,w±{yo) <t< wj{yo)} C R"+^ 

We restrict ourselves to multi-affine autonomous systems. That is, systems of the form ([T]), such 
that the vector field / is a multi-affine function, defined as a polynomial of variables xi , . . . G R" of 



degree at most one in every variable. The assumptions of Theorems |2. 1 1 and |2.2| (from [22J) are satisfied 
for systems of this class, therefore the properties stated in the above theorems can be used for reasoning 
about autonomous systems with multi-affine vector fields. 



2.2 Biochemical dynamical system 

According to II19L by a biochemical dynamical system we understand a collection of n biochemi- 
cal species interacting in biochemical reactions. Species concentrations are represented by variables 
xi, . . . ,x^ attaining values from Rq . If the stoichiometric coefficients in reactions do not exceed one 
and the reaction dynamics respects the law of mass action kinetics ll23l , the dynamical system can be 
described by a multi-affine autonomous system in the form ([T]). 



102 



Reachability in Biochemical Dynamical Systems 



In a biochemical dynamical system we are typically interested in a bounded part (/i-dimensional in- 
terval) of the phase space in Further, we consider the phase space partitioned by a (non-uniform) 
rectangular grid. In particular, for each variable there is defined a finite set of thresholds, making the sys- 
tem partition. Thresholds determine (n — 1) -dimensional hyper-planes in and can be freely specified 
according to particular questions that should be addressed by the model analysis, e.g., specification of 
unsafe or attracting sets. Cells laid out by 2n adjacent threshold hyper-planes (cells are again intervals in 
W) are called hyper-rectangles, for short we refer to them as rectangles. 

Definition 2.1 Define a biochemical dynamical system (biochemical system /or short) as a tuple = 
(fz,/, 3^ ,J^c)y where 

• n is the dimension of^, 

• / : ^ is the multi-ajfine vector field of 3^, 

• 3/^ — {T\^ . . . ^Tn) is the partition of 3^ where each Tf is a finite subset ofW^, and define the set of 
rectangles given by 3^ as 

n 

Rect{3) {Wlj I \/j3a,b E Tj .Ij [a,b]yc G Tj :c<aWc>b}, 

• -^c ^ Rect{3) is the set of initial conditions (initial setj of^. 

Definition 2.2 Let ^ — (n, /, J^c) be a biochemical system and let H G Rect{3) be a rectangle such 
that H = Ii X ... xin, where It [ai^bt]. For every / G {1, . . . ,n} define the lower (resp. upper) facet of 
H wrt the ith variable: 

Facet^{H) — {(xi, . . . \ xi — at}, 

Facet] {H) = {(xi, . . . ,x^) ^H\xi^ bi}. 

Denote Facet Si{H) the set of /th dimension facets of H, Facet Si{H) = Facet] {H) U Facet] (H), and 
Facet s{H) the set of (all) facets of H, Facet s{H) = IJ^^^ Facet Si{H). 

Definition 2.3 Let H, H' G Rect{ 3). We say that H is a neighbour of H', denoted H M H', if there exists 
F G Facets (//) such that Hr\H' ^F. 

3 Quantitative Discrete Approximation 

Given a biochemical system ^ = (^,/, 3 ^J^), we aim to define a finite automaton reflecting the be- 
haviour of SS, and for each state, to assign every transition a weight quantifying probability of proceeding 
to a particular successor. 

A state is defined as a pair - a rectangle H, and a subset £" of a particular facet of H. The set 

F represents a so-called entry set, a region through which trajectories of the system ([T]) enter the interior 
of H. Intuitively, we can say that F encodes the history of previous evolution of the system from initial 
set J'c to H. Entry sets are either subsets of [n — 1) -dimensional facets of H or (in case of initial states) 
the whole ^-dimensional rectangle H. 

Since entry sets can be arbitrary sets in Euclidean space, we approximate them by a finite discrete 
structure. Each facet is provided with a uniform grid on which we approximate any subset of the facet 
by the set of rectangular fragments, so-called tiles (Figure Is]). The grid is ^-dimensional or — 1) 
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Figure 2: Example of a biochemical system with two species and two reactions. Dynamics given by 
a system of two ODEs and the system of thresholds are in the left part of the figure. Vector field is 
visualized in the middle, and its Rectangular Abstraction Transition System on the right. 

dimensional depending on the dimension of approximated entry sets. When following the trajectories of 
solutions of differential equations of the models dynamics in time, entry sets are identified by trajectories 
of solutions passing through them on their way from preceding rectangles. In following definitions we 
treat this intuitive perception of entry sets formally. 

Let K* G N, let ^ = {n,f, ^, J^c) be a biochemical system, H G Rect{3'), and F G Facets{H) for all 
definitions and theorems from this section. 

Definition 3.1 Let H be of the form H YYj^i Ij, where V j : Ij [aj^bj]. Let B G {H} U Facets{H). Set 
either n' — n, ifB — H, or n' — {n— 1), ifB G Facets i{H) for some \ <i<n(in this case 3c G {at^bi] : 
B(lWl-\c)). 

Define the set of /c-tiles of B as Tiles^,{B) = {A C 5 | A = Il^^i^j}. where At = {c}, if B e 
FacetSi{H), and otherwise (j ^iorB — H)Aj is a closed interval in Rq oftheform [aj^^{bj — aj)^aj + 
^^{bj — aj)], where for all j G {1, . . . ,^}, J ^ i the nonnegative integer kj G No satisfies kj < K. 

The following definition introduces the notion of general entry sets. 
Definition 3.2 Define the set of entry points into a rectangle H through facet F, as the set 

Entry {F^H) = {yo G F 1 3 a trajectory y{t) of a solution of ([T]) such that 3;(0) = 

and3e>0:};(0 G//forV^ G (0,e)}. 

Next we define the approximation of entry sets on a grid of /c-tiles. Additionally, we define the 
respective (discrete) volume measure of a set (see Figure[3]c),d)). 

Definition 3.3 LetX C H. Let n' — n—\, if there exists / G {1, . . . G Facet Si{H) such thatX C F, 

and let n' — n, otherwise. Let M — F, ifX C F, and let M — H, if there is no such facet F. Define 

• the set of /c-tiles approximating the set X as 

A;,(Anx 



Tiles'^, {X) = <^ A G Tiles'^, {M) 




• the rectangular /c-grid measure of the setX as ^^i(X) = LAerto^(x) V6>/(A). 

The following definition declares the set of all discretized entry sets for a given rectangle. 
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a) b) c) d) 

Figure 3: a) Let H = [0,2.5] x [0,2.5] be a rectangle. The blue areas depict elements of Tiles^iH). 
b) Let F = FacetJ{H) = {2.5} x [0,2.5]. The red line segments are elements of Tiles\{F). The set 
EntrySets^ (//) has 2 + 4 • ( 1 + (2) + (1) ) = 30 elements: 0, //, and 7 for every facet of H (the facet itself, 
3 segments and 3 unions of pairs of segments of the facet), c) Let X be a subset of H (the shaded polygon). 
Let /c = 5. d) The set of /c— tiles approximating X is the set of five blue intervals (each satisfying the fact 
that at least half of its area is in X). The cardinality of Tiles2{X) is 5. Thus ^O^) = 5 • (0.5 • 0.5) = 1 .25. 

Definition 3.4 For H, define set of (approximate) entry sets Entry Sets ^{H) = {ECH\E = Q)\/E = 
HW3F G Facets{H),(g C Tiles ^ {Entry {F,H)) : E = U^}- 

For an example of a set of (approximate) entry sets of a rectangle see Figure |3] a), b). Note that set 
of approximate entry sets is always finite. Further note that also the empty set and the entire rectangle 
are considered as entry sets. These represent singular cases needed in the subsequent construction of 
the automaton. In particular, states with the empty entry set approximate fixed point behaviour not 
leaving the rectangle (steady state memory) whereas the rectangle-form entry set is employed for initial 
rectangles. 

Definition 3.5 Let E G EntrySets^{H),H' G Rect{^),F' E Facets{H) such thatH't^H.F' =HnH\ 

Define the focal subset of E on H targeting F', denoted Focal{H^E^F'), as the set of all yo ^ E 
such that there exist e,e',c > and a trajectory of a solution y{t) of system ([7J with inital conditions 
37(0) =370 satisfying y{t) G Hfort G (0,c),y(^) G Inter{H)fort G (0,£), y{c) G F' , andy{t) G Inter{H') 
for G (c, c + e'). Let ExitSet{H ^E ^F') denote the set of all such ( targeted) points y{c) G F'. 

Define focal subset ofEonH not leaving H, Focal{H,E,(d), as the set of all points yo ^ E such 
that there exists a trajectory of a solution y{t) of system ([7]) with initial conditions y{0) — yo satisfying 
y{t) G Hfor all t > 0. 

Next we define the successor function for any pair {H^E) and subsequently the quantitative discrete 
approximation automaton. 

Definition 3.6 Let E G EntrySets^{H). Define the successors of (H^E) as the set of pairs {H' ^E') with 
H' G Rect{^),E' G Entry Sets ^{H') such that 

Succs{{H,E)) = {{H'^E^) \H' ^E' satisfy one of conditions 1. — 3. below} 

7. H' t^H, E ^(d. Denote F' the facet of H satisfying F' = HDH^ Let n' ^ n, if E ^ H, and 
n' — {n — 1), otherwise. Moreover, E' — \JTiles^{ExitSet{H ^E ^F')) and X^,{Focal{H^E^Qi)) > 0. 

2. H' ^H,E^ 0, and E' 0. Further, it holds that either E CF and X^_^ {Focal{H ,E ,^)) > 0, or 
E^H and?i^{Focal{H,E,(d)) > 0. 

3. H'^HandE'^E^id. 
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Definition 3.7 (The Quantitative Discrete Approximation Automaton) Let k*, ^ be as above. The 
quantitative abstraction automaton QDAAk{^) of a biochemical system SS with parameter K is a tu- 
ple QDAA^{^) = {S,J^c,S,p), where 

• the set of states S={{H,E)\H e Rect{^),E G Entry Sets ^{H)}, 

• the set of initial conditions Ic = {{H^H) \ H G ^c}y 

• the transition function 8 '.S is defined as 8{{H^E)) = Succs{{H^E)), 
the weight function p : S x S ^ [0,1] is defined by the following expression, where 5 5' = 
{H' ^E'). Suppose n' — n, in case E —H^ and n' — n—\, otherwise. 



1, if H = H',E = E' = (d, 

?i'',(Focal(H,E,m) 

I.AeFacets{H)U{(d} K' V^^^^y^ ^^)) 

X^^{Focal{H,E,F')) 

l.AeFacetsiH)u{(d} K' {Focal{H ,E ,A)) 
0, otherwise 



Example 3.1 Assume the biochemical system from Figure^ See Figure^ a) for an example of fo- 
cal subsets described below. Let R — [0,2.5] x [2.5,5] be a rectangle and let Fq = Facet^{R)^F\ — 
FacetJ{R),F2 — Facetj; {R) , Fi, — Facet^{R). For the state (R^Fq) the focal set ofFi equals Fq, whereas 
Focal (Fq) = Focal (F2) = Focal (F^) = 0. 

Let H — [0, 2.5] x [0, 2.5] and F — FacetJ (R). For the state {H^H) the set Focal{F) is the blue area 
inside H and Focal{(d) is the yellow area. All the solutions of the biochemical systems dynamics with 
initial conditions in Focal{(d) approach the yellow line of fixed points and stay in H forever. All the 
solutions starting in the blue area leave H infinite time through F. 

In the right part of Figure^is the set of reachable states of the quantitative discrete approximation 
automaton (QDAA) obtained from the biochemical system described in Figure^with initial conditions 
J^c = {[0,2.5] X [0,2.5]}. 

Let H,R be the same as above. Let S — [2.5,5] x [0,2.5] and let J^c — {H}. The QDAA successor 
states of{H^H) are {H^(d) (a selfioop state) and (S^E) (where E denotes the K-tiles approximation of the 
red segment in Facet f{S) ). For K ^ 00 the weights of these two transitions approach the area ratios of 
yellow and blue regions ofH respectively. The only successor of (//, 0) is (by definition) itself. The state 
{S^E) has one successor (5, 0), since all the trajectories beginning in E approach the line of fixed points 
and stay inside S forever. 

Therefore the set of concentrations reachable from initial rectangle H is [0, 5] x [0, 2.5]. See the rect- 
angular abstraction transition system from Figure^where the set reachable from H is [0,5] x [0,2.5] U 
[2.5,5] X [2.5,5], although there exists no trajectory of a solution of the biochemical systems dynamics 
that starts in H and reaches a point inside [2.5, 5] x [2.5, 5] . 

On the other hand, if K is too small, some behaviours of the system are not reflected in QDAA, 
because the set of K-tiles corresponding to the entry set may be empty. With finer partition into K-tiles 
smaller entry sets can be captured and approximation of the biochemical system by a QDAA is more 
realistic. 



In the next theorem we ensure correctness of using the Lebesque measure in Definition 3.7 We 
ensure that there is no non-zero volume entry set such that all trajectories from this set lead to a facet 
without entering the interior of a neighbouring rectangle. For the proofs of following three theorems see 
the full version of this paper available at |TF|. 
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Figure 4: a) Focal sets examples, b) QDAA example. 



Theorem 3.1 Let E G Entry Sets ^(H),E ^ 0. Further, let n' — n, ifE — H, and n' — {n— 1), otherwise. 
Then 

X^,{Focal{H,E,A)) > 0, (2) 

AeFacets{H)U{(d} 

Theorem 3.2 The quantitative abstraction automaton QDAA^(^) of a biochemical system 3§ is a dis- 
crete time Markov chain. 

Finally, we provide a theorem suggesting that for sufficiently large values of parameter K*, the rectan- 
gular K*-grid measure of a bounded set X contained in the phase space of biochemical system approaches 
its Lebesque outer measure. For proof of this theorem see IIT6II . 



Theorem 3.3 LetXCH e Rect{3"). Then 

K^oo 

Note that the result applies also to the case with X CF e Facets (H) and , ^n-i- 



(3) 



4 Algorithm 

This section introduces procedures for obtaining the reachable state space of the quantitative discrete 
approximation automaton. Algorithm [T] is a procedure of computing the set of reachable states. Al- 
gorithm [2] describes the computation of transitions from one state (i.e. successors) together with their 
weights using numerical simulations. 

The procedure of computing reachable state space (Algorithm [T]) is based on breadth first search. 
States corresponding to initial conditions of the biological system are enqueued first and a list of states 
already visited is maintained. The computation is always finite, because there are only finitely many 
possible states of the automaton and each of them can be at most once added and after the computation 
of its successors removed from the queue. 
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Algorithm 1 Computing the set of reachable states 

Require: ^ = {nj, ^, ^c), ^^^^ 

Ensure: Reachable = set of all reachable states of the automaton QDAAk(^) 

1 : Reachable ^ 

2: for all e J^c do 

3: s^(H,H) 

4: Reachable ^ Reachable U{^} 

5 : Queue .pushB ack(5) 

6: while Queue / do 

7: s ^ Queue. firstElement 

8: A ^ getSuccessors(^) 

9: forallflGAdo 
10: if <3 ^ Reachable then 

1 1 : Reachable ^ Reachable U{a} 

12: Queue.pushBack(fl) 

13: return Reachable 



Computation of the successors (Algorithm [2]) of one state requires determining the rectangles and 
the entry sets of the successors and weights of the transitions. This can be done approximately using 
numerical simulations. We sample the entry set of the state and perform numerical simulations with 
the sampled points as initial conditions and the dynamics of the given biological system as the vector 
field. For each simulated trajectory we watch whether it leaves the rectangle before given maximal time 
interval elapses. If this is the case then the location of the exit point through which the trajectory leaves 
the rectangle is of interest. 

Entry sets of the successor states are also determined within Algorithm [2j If the successor is a 
selfloop state the entry set is empty. For a neighbouring rectangle successor with one common facet the 
entry set is computed using the exit points locations and more numerical simulations. From the set of 
exit points in a facet we can estimate the set of K*-tiles of the facet that surely have nonempty intersection 
with the exit set. It remains to decide in which of the /c-tiles the intersection of the tile with the exit set 
takes at least one half of the volume of the tile. 

To this end we use numerical simulations and the fact that for an autonomous system of ODEs 
X = f(x) with a solution x{t) the function x{—t) is a solution of autonomous system x = —f{x). For 
determining whether to include a /c-tile in the entry set of a successor state, we sample the tile and 
perform numerical simulations of the trajectories of system x = —f{x). If more than one half of the 
simulated trajectories go through the rectangle and the entry set of the original state, then the /c-tile is 
included in the entry set of successor state, otherwise the K*-tile is not included. 

Weights of the transitions correspond to portions of the set of performed simulations that leave the 
rectangle to the respective neighbouring rectangles. Weight of the transition from the state to the so- 
called selfloop state with the same rectangle is determined as the portion of trajectories that do not leave 
the rectangle in given maximal time interval. 

Performing backward simulations (lines [T6] - [24l of Algorithm [2]) can be switched off. The resulting 
transition system differs from the QDAA in the entry sets, that can be larger. Difference of the outputs 
can be seen on FigurejS] The algorithm with backward simulations computes the QDAA and for (/c ^ oo) 
approaches the real behaviour of the solutions of dynamics ODE system. On the other hand the algorithm 
without backward simulations overapproximates the entry sets, therefore the transitions are included 
even if the entry set of a state is smaller than half of one /c-tile. Both options still lead to automatons 
with reachable states whose rectangles are included in the set of reachable rectangles of the rectangular 
abstraction with the same initial rectangles. 

The worst case complexity of the algorithms follows. There are at most rectangles in the phase 
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Algorithm 2 Procedure getSuccessors 

Require: ^ = ^, J^), K,MeN,He Rect{^), E e EntrySets{H) 
Ensure: Successors = 5Mcc5K:((-f^,^)) 



1: if£ = 0then 

2: Successors ^ {(/^,0)} 

3: return Successors 

4: A ^ set of M random points in E 

5: ExitPoints^0 

6: Stayslnside ^ 

7: for allxo G A do 

8: simulate trajectory from xq until it leaves H through a point x\ or given time elapses 

9: if XI exists then 

10: ExitPoints ^ ExitPoints U{xi } 

1 1 : else 

12: Stayslnside ^ Stayslnside +1 

13: fov2i\\VeFacets{H),V^HnH' 

14: if ExitPoints n F / then 

15: Entry Tiles ^ {Z G Tiles^l^) | Zn ExitPoints / 0} 

16: for all Z G EntryTiles do 

17: 5 ^ set of M random points in Z 

1 8 : RealPointsCount ^ 

19: for all JO G 5 do 

20: simulate trajectory from jq until it leaves through a point _yi or given time elapses 

21: if};iG£then 

22: RealPointsCount ^ RealPointsCount +1 

23: if RealPointsCount < f then 

24: EntryTiles ^ EntryTiles \{Z} 

25 : if EntryTiles / then 

26: Successors ^ Successors [J {H' ^EntryTiles) 

27: Weight[(//,£)][(//,£^rr);r//^.)] ^ |ExitPointsnF| 

1^1 

28: return Successors 



space of the biochemical system, where k is the maximal number of thresholds on one variable. The 
maximal number of states of QDAA of the form {H^ E) for a fixed rectangle H is2n- (2'^" — l) , where 
n is the dimension of the biochemical system. For the average numbers of visited different states of 
QDAA with the same rectangle encountered while analysing our evaluation models see the line labeled 
p in Table [1} Complexity of the computation of successors of a given state depends on the dimension 
of the system, the K parameter and on the number of simulations M used per one tile. In the worst case 
when all the tiles are examined (either as a part of entry set or potential exit set) there are 2n • K^~^ M 
simulations. 

Visualization of the state space of QDAA involves highlighting the borders of the rectangles H such 
that there is at least one state {H,E) visited during the computation. The intensity of the fill colour of 
a rectangle H is calculated proportional to the sum of weights of all possible paths from initial set J^c 
to the first appearance of states with H as the rectangle. The weight of a finite path is obtained as the 
product of weights of the subsequent transitions in the path. The sum is always between zero and one. 

5 Evaluation and Case Study 

In this section the state spaces of several biological models (of dimensions two, four and seven) are 
explored. Using our prototype implementation of the algorithms from Section |4] implemented in C++, 
we evaluate our approach on two exemplary biochemical systems. Additionally, we provide a case study 
held on a biochemical pathway studied in E. coli and compare the reachability results of the case study 
and one of the smaller models with results obtained using the rectangular abstraction approach. 
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Figure 5: Reachability in oscillatory model and comparison with numerical simulation, first two figures 
were obtained using the full version of Algorithm [2| the third one with lines 16 - 24 omitted. For com- 
parison: using the rectangular abstraction transition system on this biochemical model, the whole phase 
space [0, 30] x [0, 12] is reachable from the same inital conditions. 





Oscillatory 


Enzyme 


K 
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16 


32 


64 


128 
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7 




52 


46 


40 


39 


37 


35 


76 


104 


123 


166 


p 


1.63 


2.2 


3.78 


2.9 


4.57 


6 


4.36 


10.76 


16.8 


53.6 



Table 1 : Results for the two models and several different settings of the discretization parameter K. 



Before we proceed with the models, let us introduce several terms useful for the evaluation. For a 
biochemical system ^ = (^,/, ^c) we denote ^(j^c) ^ Rect{^) the set of all rectangles reachable 
from initial set J'c- For each H G Rect{3/^) we denote mem{H) the subset of ^{^c) consisting of 
all states reachable from the initial set with H as rectangle, the so-called memory of the rectangle H, 
mem{H) = {{R^E) G ^{J^c) \ R = H}. Further we denote p the average number of memory states 
(cardinality of mem{H) averaged over all H G ^(j^c))- The number of QDAA states representing the 
memory of a rectangle is in the worst case equal to the number of all its possible entry sets. However, 
the actual values of p in our examples are much smaller (see Table [T]). 

Let us focus on the effect of parameter k on cardinality of ^{^c) and on p. Expected behaviour of 
the approximation is the following. Every facet is divided into K^~^ tiles. A tile is included in the entry 
set E of some reachable state {H^E) if the focal subset Focal{H^E^A) fills at least half of the volume of 
the tile. For higher values of /c, the set Tiles^i{Focal{H ^E ^A)) better approximates the set Focal{H^E^A) 
because of the higher /c-grid resolution. Thus with increasing K*, the quantitative information denoting 
the probability of reaching states in M{^c) can be computed more precisely. We demonstrate that on 
models examined below. 

First, we consider a 2-dimensional model which is a variant of Lotka-Volterra model with oscil- 
latory behaviour. Details of the dynamics, threshold concentration values and initial conditions of all 
experimented models can be found in the full version of this paper 1 16 |. Results achieved on our im- 
plementation are presented in Table [T] and visualized in Figure [5] Black rectangles denote the initial set. 
Similarly, we examined a 4-dimensional model of basic enzyme kinetics. Projection of the approximated 
phase space to the enzyme/substrate plane is shown in Figure [6j For both the oscillatory model and the 
enzyme kinetics model full version of Algorithm [2] (with backward simulations) was used. 
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Figure 6: Enzyme kinetics model - projection of the reachable set to the enzyme/substrate plane and 
comparison with numerical simulation. 



AmtB + NH4ex AmtB : NH4 y^i = 5 • 10^ = 5 • 10^ 

AmtB : NH4 ^ AmtB : NH3 + Hex h = 50 
AmtB : NH3 AmtB + NH3 in = 50 
NH4in ^ = 80 

NH^in + Hin NH4in ke = l- lO^^y^v = 5.62 • 10^ 

NH^in 1^-^ NH^ex kg = kg = 1.4- 10'^ 

Figure 7: Ammonium transport model (left). Simulations of the ammonium assimilation model from 20 
randomly sampled points in J^c projected on the concentration of NH/^in, blue lines represent bounds on 
this concentration found by the QDAA - two subsequent thresholds 10~^, 10~^ (right). 




5.1 Case Study on E.Coli Ammonium Assimilation Model 

We consider a model specifying the ammonium transport from the external environment into cells of 
E. Coli [26J . The model describes the ammonium transport process that takes effect at very low ex- 
ternal anmionium concentrations. In such conditions, the transport process complements the deficient 
ammonium diffusion. The process is driven by a membrane-located ammonium transport protein AmtB 
that binds external ammonium cations NH/^ex and uses their electrical potential to conduct NH^ into 
the cytoplasm. In Figure [Tj biochemical reactions of this model and the scheme of the transport chan- 
nel are shown (left and middle). The initial conditions of the species concentrations considered for the 
ammonium transport model: 

^c: NHsexe (28 • 10-^29 • 10"^), 7V//4^x G (49 • 10-^5 • lO"^), 

AmtB G (0, 1 • 10-^), AmtB : NH3 E (0, 1 • lO-^),AmtB : NH4 E (0, 1 • lO"^), 
NHsin E (1 • 10-^ 11 • lO-^),NH4in E (2- 10-^21 • IQ-^). 

The level of pH and external ammonium concentration are considered constant. For the system of 
ODEs and list of thresholds of this biological model see the the full version of this paper |[T6l . 

The upper bounds on concentrations of NH^in and NH/^in considering the biological system with 
given initial conditions were estimated as 1.1 • 10~^ (NH^in does not exceed the initial concentration) 
and 5.4 • 10""^ by the rectangular abstraction (overapproximation). 

Reachable intervals using Algorithm [2] without the backward simulations were [10-^l.l •10-^] for 
NH^in (NH^in does not exceed the initial concentration), and [10~^, 10~^] for NH/^in. This results are in 
agreement with simulated data and in the case of the concentration of NH^n the QDAA results are by 
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one order closer to numerical simulations than the rectangular abstraction results as can be seen in the 
right part of Figure [7] 

6 Conclusion 

We have presented a new theoretical method for finite discrete approximation of autonomous contin- 
uous systems equipped with a measure that indirectly quantifies correspondence of the approximated 
behaviour with the original continuous behaviour. We have provided a computational technique which 
we implemented in a prototype software. We have examined the implementation on small dimensional 
models which showed satisfactory results for computing reachability. 

The method can be either used as a parameterized simulation technique or employed with rectangular 
abstraction to quantify the extent of spurious counterexamples. Thus the method can improve the current 
possibilities of analysis based on model checking techniques. We leave for future work integration of 
this method into the software for model checking of biochemical dynamical systems lfT3]| . 

At the theoretical side, we leave for future work precise clarification of our method wrt the rectangu- 
lar abstraction. From the computational viewpoint, we aim to develop a parallel reachability algorithm 
that would make the method scalable and applicable to systems of larger dimensions. 
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